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Abstract 

The formalism by Press and Schechter (PS) is often used to infer number densities 
of virialized objects of mass M (e.g. quasars, galaxies, clusters of galaxies, etc.) from a 
count of initially overdense regions in a Gaussian density perturbation field. We reanalyze 
the PS-formalism by explicitly counting underdense regions which are embedded within 
overdense regions, so called cloud-in-clouds. In contrast to the original PS-formalism, our 
revised analysis automatically accounts for all the cosmic material. We find that mass 
distribution functions for virialized objects are altered by the proper solution of the cloud- 
in-cloud problem. These altered distribution functions agree much better with distribution 
functions inferred from N-body simulations than the original PS-distribution functions. 
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1. Introduction 

In our current understanding the presently observed structure of the universe has 
formed by the gravitational growth of small-amplitude density perturbations extant at the 
epoch of matter-radiation equality at high redshift {z ^ 1000). It is well known that a 
detailed comparison between quantities of the observed structure (e.g. spatial and angular 
correlation functions, the peculiar velocity field, the number densities of quasars, damped 
Lyman-CK clouds, and galaxy clusters) and predictions of these quantities in specific galaxy 
formation scenarios can yield valuable information about the matter content of the universe 
and the nature of the initial density perturbations. For a reliable prediction of the structure 
formed in specific cosmic scenarios one has, in principle, to resort to detailed N-body 
simulations. In practice, however, such N-body simulations may be rather time-consuming 
and can currently only resolve a limited dynamic range of mass. 

A complimentary and analytic approach to the problem of structure formation has 
been developed by Press & Schechter 1974 (hereafter; PS). In their analysis the number 
densities and masses of virialized objects is directly inferred from the statistics of the initial 
density perturbation field in a manner to be illustrated in the following section. The PS- 
prediction for the fraction of cosmic matter contained in virialized and separate halos of 
mass M, so called multiplicity- or mass- functions, is in fairly good agreement with results 
from N-body calculations indicating the general validity of the PS-approach. The PS-mass 
functions have been applied to predict the number densities of galaxies (Cole & Kaiser 
1989; White & Frenk 1991; Kauffmann, White, & Guiderdoni 1993), clusters of galaxies 
(Cole & Kaiser 1988; Zhan 1990; Bartlett & Silk 1993), quasars (Carlberg 1990; Nusser 
& Silk 1993), Lyman-o; clouds (Kauffmann & Chariot 1994; Mo & Miralda-Escude 1994), 
and dark halos (Narayan & White 1988) which emerge in different structure formation 
scenarios and at different redshifts. Recently the analysis by PS has also been extended 
to estimate the frequency of halo mergers and to predict details about the substructure of 
clusters of halos (Bower 1991; Bond et al. 1991; Kauffmann & White 1993; Lacey & Cole 
1993; 1994). 

It has been noted, however, that the original analysis by PS is strictly speaking in- 
correct since it only associates half of the mass of the initial density perturbation field 
with eventually to be virialized and gravitationally self-bound structures. PS achieved 
proper normalization of the mass functions by simply multiplying results by a factor of 
two. The remaining half of cosmic material, which is not automatically accounted for in the 
PS-formalism, is material initially present in underdense regions. It is believed that this 
remaining half of material is accounted for if a proper solution to the cloud-in-cloud prob- 
lem is found, in particular the problem of correctly counting underdense regions which are 
embedded within overdense regions. The cloud-in-cloud problem has also plagued other 
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analytic approaches to the subject of structure formation, such as the peaks formalism 
(Peacock & Heavens 1985; Bardeen et al. 1986). 

Bond et al. (1991) proposed a solution to the cloud-in-cloud problem by considering 
excursion set mass functions. They were able to obtain analytic results for the mass func- 
tions if they applied sharp A;-space filtering. In sharp A;-space filtering average densities 
within a given volume are computed only from those Fourier modes of the density distribu- 
tion which have wavelength larger than the characteristic dimension of the volume (refer 
to the next section). Their analysis accounts for all the cosmic material. Surprisingly, 
their analysis recovers the original PS-mass functions including the normalization factor of 
two. A similar result has been obtained by Peacock and Heavens (1990). When different 
filtering functions are considered (e.g. top hat or Gaussian filters in coordinate space) the 
shapes of mass functions may be altered. 

In this paper we reanalyze the cloud-in-cloud problem in the context of the PS- 
formalism. The method outlined is an extension of the original PS-analysis which reduces 
to the PS-prescription when the cloud-in-cloud problem is (incorrectly) ignored. We find 
that a proper treatment of the cloud-in-cloud problem changes the shapes of mass func- 
tions, in contrast to the conclusions of Bond et al. (1991) and Peacock & Heavens (1990). 
We show that the renormalized mass functions are in much better agreement with results 
from N-body simulations than the original PS mass functions. 

2. The cloud-in-cloud problem and mass distribution functions 

It is well known that length and amplitude of small-amplitude density perturbations 
(i.e. amplitude S = {p — p)/p « 1; where p is the mass density of the perturbation 
and p is the average cosmic mass density) grow proportionally to the cosmic scale factor 
during a matter-dominated epoch. When reaching non-linearity (5 ~ 1) perturbations 
recoUapse and virialize. (cf. Kolb & Turner 1989). In the absence of dissipative processes 
fiuctuations will maintain their sizes and densities thereafter. Perturbations which are 
overdense by more than some critical amount Sc{z) at the epoch of matter-radiation density 
equality will have recoUapsed before the epoch with redshift z. The critical amplitude 5c 
is approximately 5c ~ 1.68(1 -|- z)/{l + Zcq) (cf. Kolb & Turner 1989). Here z^q is the 
redshift of matter-radiation density equality. 

One can probe the initial density perturbation field by determining the fraction of 
space which is overdense by the critical amount 5c when averaged over a spherical volume 
V (or, equivalently, on a mass scale M = pV). Presently most structure formation sce- 
narios assume the existence of Gaussian density perturbations as the seeds for the cosmic 
structure. In this case, the probability f{5c,M) to find a region of mass scale M to be 
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overdense by more than the critical amount 5c is simply given by an integral over the tail 
of a Gaussian distribution function 
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f{6c, M) = J 



1 1 




1 

2(j2(M) 



dS . 
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The width of this Gaussian distribution function, cr(M), can be inferred from the 
statistics of the Fourier modes of the density distribution. For Gaussian random fluctu- 
ations the magnitude of the amplitudes of the Fourier modes \dk\ are distributed accord- 
ing to a half-Gaussian. The (pk follow a flat distribution in the interval (pk ^ [0, 27r] so 
that different Fourier modes have uncorrelated phases (pk- In this case, the overdensity 
d{x) — 2^^ \Sk\cos{kx + (pk) at some coordinate x is the sum over several uncorrelated 
terms. By the central limit theorem the distribution function for 6 will then be a Gaussian 
centered around value zero. 

If one wishes to determine the average density 6m (x) on a mass scale M (or in a volume 
V = M/ p) only those Fourier modes should be added which have wavelengths {2n/k) larger 
than the characteristic dimension of the volume. Such a prescription corresponds to sharp 
/c-space filtering. It is expected that the contributions from Fourier modes which have 
wavelengths shorter than the characteristic dimension of the volume will average out to 
zero. 

The variance a{M) of the resultant distribution function for Sm can be computed 
from the variance (l^fcp) of the half-Gaussian distribution function for \5k\- Here the 
square brackets () denote an ensemble average. The ensemble includes different universes 
with different randomly chosen Sk and (pk distributed according to the half-Gaussian- and 
flat distribution functions, respectively. Note, that we assume implicitly that an ensemble 
average is equivalent to a spatial average. Then 



In this expression L^k'^dk/ (27r)^ counts the number of different Fourier modes and L is the 
length of the fundamental cube of the simulation. (Results will emerge independent from 
L.) Bond et al. (1991) have noted that equation (2a) is completely analogous to computing 
the mean-square-distance (d^) (corresponding to cr^(M)) which a particle random walking 
in one dimension will have moved away from it's initial position, given the expectation 
value of the square of the particle's mean free path (P) (corresponding to (|(^feP)) and the 
number of scatterings N (corresponding to the number of Fourier modes). In particular 
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The sharp A;-space cutoff kc in equation (2a) is determined by the mass of the probing 
volume M 



The relationship between the cutoff wavenumber kc and the mass M is ambiguous up to 
factors of unity. This is because it is not evident if the diameter of a spherical volume 
V should be set equal to either the cutoff wavelength or, for example, half the cutoff 
wavelength. This ambiguity results in an uncertainty of what mass should ultimately be 
associated with an initially overdense region. It does, however, not affect the shape of the 
resulting mass function. Others have chosen a value of k^ = 6it^{M/ p)~^ (Lacey & Cole 
1994). 

One often assumes a featureless power-law for the variance of the magnitude of the 
Fourier amplitude, i.e. (|5fcp) = L~^Ak'^. In this case, equation (2a) yields 



Here is the mass scale of unity variance a"^(M^) = 1. The coefficient A and the exact 
relationship between mass and cutoff momentum kc{M) are included in M^. 

In specific scenarios of structure formation and if one desires to work over extended 
ranges of mass the featureless power-law should be replaced by the appropriate functional 
form for (|(^feP) which, in general, is more complicated. The dependence of (|(^feP) on 
k determined at the epoch of matter-radiation density equality for a variety of cosmic 
scenarios can, for example, be taken from the paper by Holtzmann (1989). Equation (2a) 
can then be numerically integrated to yield an analogous relationship to equation (3). 

It is now our goal to compute the number density of isolated regions with an over- 
density 5 exceeding the critical one, 5 > Sc. Isolated regions overdense by 6c or more 
are regions which are not included in yet larger regions overdense by Sc or more. In par- 
ticular, a region of mass scale M will only be counted as an eventually virialized object 
of mass M, if for any larger mass scale the average overdensity of the region is below 
the threshold. Consider a set of mass scales (Mi, M2, M3, ...) which arc ordered by size 
(Ml < M2 < M3 < ...). Furthermore, consider a large representive cosmic volume L^. The 
total volume, L^f{6c,Mi), which is found to be overdense by dc or more on the smallest 
scale Ml is given by the number of isolated objects of scale Mi, L^n*(Mi), times the 
size of an individual object's volume, Vi = Mi/p, plus the contributions from finding 
overdense regions on scale Mi included in larger isolated regions (M2, M3, ...). The latter 
contribution, for example from isolated regions on scale M2, is given by the product of 
number of isolated regions on scale M2, L^n*(M2), the volume of these isolated regions, 
V2 = M2/P, and the probability -P(Mi, M2) of finding a region of scale Mi overdense by 5c 
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or more, provided it is included in an isolated, overdense region of size M2. In the original 
PS- analysis it was erroneously assumed that this probability is unity. In a similar manner 
isolated regions of even larger sizes make contributions to L^f{5c, M). 
When we add up all contributions we find 

00 

/(<5c,Mi) = ^ VMin*(Mi)P(Mi,Mi) , (4a) 

where the limiting probability limMi^M2 PiMi, M2) is understood to be unity. Equation 
(4a) can be rewritten into integral form 

1 

f{Sc,Mi) = - / dM' M'n{M')P{Mi,M') . (46) 
P J Ml 

Here n{M) is the number density of isolated and overdense regions per unit mass inter- 
val. This is exactly the desired mass function which PS intended to derive. Completely 
analogous equations to equations (4ab) can be written down for any mass scale Mj. The 
resulting set of equations then represent a matrix equation, such that the product of ma- 
trix P{Mj, Mi) and vector Min*{Mi) has to equal the vector f{Sc, Mj). This matrix has a 
form which is already suitable for Gaussian back substitution. This is because the n{Mj) 
only depend on those n{Mi) with Mj > Mj and so all lower off-diagonal elements of the 
matrix Pji = P{Mj,Mi) are zero (i.e. Pji = 0, for j > i). To obtain n{M) one has, in 
principle, to solve an infinitely large matrix equation. In practice, however, one can obtain 
very good approximations to n{M) by truncating the matrix for large masses. The number 
density of isolated, overdense regions with large masses is exponentially suppressed so that 
the upper, infinite bound in equations (4ab) can be replaced by some large mass Mtmnc- 
To fully appreciate the modification to the original PS-formalism one can take the 
derivative of equation (4b) with respect to mass Mi. This yields 



dM 



= _lMn(M) + i / dM'M'niM')'-^^^ . (5) 
P PJm ^ ' dM 



Note, that by neglecting the second term on the right-hand side of equation (5), the original 
PS-prescription is recovered. However, this term does not vanish since the probability 
P{M, M') does vary with M. 

The probability P{M, M') to find a region of size M overdense by 5c or more when it 
is included in a larger region M' overdense by 5c or more can be computed in the following 
way. The overdensity 5' of the large region is determined by the sum of the amplitudes of 
all the Fourier modes between A; = and the cutoff k = k'^, where the cutoff is computed 
from equation (2b) for the mass M'. Substructure within the large region, that is the 
overdensity (underdensity) for any smaller region M contained within M' relative to the 
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overdensity 5' , is determined by the sum of the amphtudes of all the Fourier modes which 
have wave numbers between k'^ and kc- Here kc is computed for the smaller mass M. The 
probability distribution for the overdensity 5 of the smaller region will follow a Gaussian 
distribution centered at the larger region's overdensity 5' . If one assumes a power-law for 
the variances the width Usuh of this Gaussian can be obtained from equation (2a). 

This is accomplished if one replaces the lower integral limit in equation (2a) by k'^ 

The probability P{M, M') is then computed by a double integral. The first integral 
is performed over all possible larger region's overdensities having 6' > 6c- The second 
integral computes for any given 6' the probability of finding a smaller region's overdensity 
to exceed the threshold 5 > 5c as well. This latter integral is obtained by integrating over 
the Gaussian distribution of the smaller region's overdensity. We find 



(7a) 



where the constant N assures proper normalization 

The width cr' of the larger region's overdensity distribution is given by equation (3) with 
M replaced by M'. In the limit M — > M' the width a sub approaches zero and the sec- 
ond integral in eq(7a) becomes a delta function. This yields a limiting probability of 
limM-»M' P{M,M') = 1. In the opposite limit when M approaches zero the probability 
P{M,M') approaches 1/2. 

It is the goal to solve the implicit equation (5) for n{M). We have derived a rather 
lengthy expression for dP{M,M')/dM which, however, is more suitable for numerical 
integration than equation (7a) 
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Here erfc(a;) is the complimentary error function 
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erfc(a;) 



e-* dt . 



The derivative dasuh/dM can be obtained from equation (6). 

It is now straightforward to solve equation (5) for n{M) with the help of equation 
(1) and equation (8). This is done by assuming a very large mass Mtmnc for which 
df{dc, M)/dM is exponentially small. For such a large mass one obtains a good approxi- 
mation to n{Mtrunc) by only considering the first term on the right-hand-side of equation 
(5). To compute n{M) for smaller masses the full equation (5) should be employed. This 
is easily done if one derives n(M) for successively smaller masses by using previously de- 
termined n{M) for all the larger masses. In this way the shape of the original PS- mass 
function will be altered. 

We have numerically solved equation (5) to obtain the multiplicity function. (We will 
give an analytic approximation below.) The multiplicity function is the fraction of matter 
dp{M)/p contained in virialized objects of mass M per unit logarithmic mass interval 
cilog2M. The multiplicity function is related to n{M) by 



Results for multiplicity functions are presented in figure (1). In this figure, the upper 
panel shows the multiplicity function for a spectral index n = whereas the lower panel 
shows the multiplicity function for a spectral index of n = —1. The heavy solid lines show 
the renormalized multiplicity functions derived in this paper. For comparison the light 
solid lines represent the original PS-results. The figure also shows multiplicity functions 
inferred from N-body simulations. These have been taken from Efstathiou et al. (1988) (cf. 
figs. 9 in Efstathiou et al.). The multiplicity functions inferred from N-body simulations 
are shown for the last seven output times of the simulation and are scaled according to self- 
similarity of the multiplicity functions at different times. The up-turn of the multiplicity 
functions for lower multiplicities or, equivalently, masses arises from the limited resolution 
of the N-body simulations. 

The figure illustrates that the multipicity functions derived from equation (5), which 
include the solution of the cloud-in-cloud problem, provide a much better approximation 
of the N-body results than the original PS-multiplicity functions. Over the limited range 
of masses (multiplicities) shown in figure (1) the differences between the PS-result and the 
result from equation (5) are within a factor of two. 

In figures (2abc) we show the multiplicity functions on a double-logarithmic plot 
over an extended mass range. The solid lines represent the renormalized multiplicity 



1 dp{M) _ 1 
p dXog^M p 



M^n(M)\oge2 . 
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functions, whereas the dashed hnes show the original PS multiphcity functions. It is 
evident that the differences between these two multiphcity functions can be as large as 
an order of magnitude at the small-mass scale side of the distribution. In particular, 
the PS-analysis systematically underestimates the number densities of small mass objects. 
This discrepancy between our result and the PS-approximation is particularly large for a 
positive spectral index. On the large-mass scale end of the distribution the PS multiplicity 
functions overestimate the number densities of objects by a factor of two independent of 
the spectral index. However, this factor-of-two difference can be easily absorbed into the 
uncertainties associated with the absolute mass scale of objects (cf. equation 2b). This is 
because there is no change in the shape of the multiplicity functions at the rare, massive- 
object side of the distribution and multiplicity functions decrease very rapidly for large 
masses. 

The modification of the mass distribution functions presented here does not affect 
the self-similarity of mass distribution functions at different times in hierarchical structure 
formation. In particular, as described by PS the shapes of the mass functions are indepen- 
dent of epoch or, equivalently, threshold Sc- Mass functions differ only for different 5c by 
an increase in the characteristic mass scale of the distribution functions with decreasing 
threshold Sc- 

The formalism developed naturally assigns all the cosmic material to overdense regions 
which will eventually form virialized objects. Thus, the missing half of cosmic material, 
which could not be accoimted for in the PS-formalism, is accounted for once a solution to 
the cloud-in-cloud problem is known. This can be seen by considering a mass scale M very 
much smaller than the characteristic mass scale of the peak of the multiplicity function. In 
the limit when M approaches zero both f{6c, M) and P{M, M') approach one- half. With 
the help of equation (4b) one can then infer 

lim - / dM'M'n(M') = 1 , 

so that the computed n{M) will automatically have the proper normalization. 

Our results are in direct contradiction with the results obtained by Bond et al. (1991) 
and Peacock & Heavens (1990). Their analysis confirmed the original PS-result. Bond et 
al. consider a set of different-sized mass scales. For each point in space they determine the 
largest of these mass scales M for which the region around the point is overdense by the 
critical amount. They then associate such a region with an eventually self-bound object of 
mass M. They perform such an analysis independently for each space point by considering 
all possible realizations of Fourier-mode amplitudes. Their analysis is incorrect, however, 
since it does not allow for correlations between neighboring space points. In other words, 
if their analysis finds an object of mass M at point x it will also find an object of mass 
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M + dM at point x + dx and it will count both as separate objects. In reality, the object 
of mass M at a; will have a finite size and will include the space point at x + dx, so that 
only one object of mass M should be counted. Note, that our treatment implicitly includes 
spatial correlations by the determination of the probability function P{M, M'). 

We have not been able to derive an analytic solution for n{M). It is not always 
convenient to have to invert a matrix for a determination of multiplicity functions and 
number densities of virialized objects. We therefore obtained an analytic approximation 
for n(M) 

n(M) ^ inps(M)(l - O.GSy"^ + 1.17^"^) , (11a) 

with 

1 I n 

Here nps(M) is the original PS-result, in particular 

The multiplicity functions derived by using the analytic approximation equation (11a) are 
shown in figures (2abc) by the dotted lines. Over the whole range of masses shown in these 
figures and for all three spectral indices equation (11a) approximates the renormalized 
number densities by better than ten per cent. If higher accuracies are desired a numerical 
treatment has to be performed. 

3. Conclusions 

We have reanalyzed the Press-Schechtcr formalism which is often used to estimate 
number densities of virialized objects from the statistics of the primordial density perturba- 
tions. Our result addresses the importance of the cloud-in-cloud problem, in particular the 
correct count of underdense regions which are embedded within over dense regions. In con- 
trast to previous work we find that by considering a proper solution to the cloud-in-cloud 
problem mass distribution functions are modified. These renormalized mass distribution 
functions provide an excellent approximation of mass distribution functions inferred from 
N-body simulations. Our analysis shows that the original PS-formalism systematically 
overestimates the number of rare, massive objects by a factor of two, while the number 
of low-mass objects are underestimated by up to an order of magnitude. The approach 
described here avoids this. 
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6. Figure Captions 



Figure 1 The multiplicity function, i.e. the fraction of cosmic material contained in virialized 
and isolated objects of mass M per unit logarithmic mass interval (cf. equation 9) as 
a function of log2M. The absolute value of mass on the ordinate is arbitrary. The 
heavy solid line shows the result derived in this paper. The light solid line shows 
the original PS-result. For comparison the solid (up-turning), dotted, short-dashed, 
long-dashed, short-dashed dotted, long-dashed dotted, and short-dashed long-dashed 
lines give results from N-body simulations. The N-body results have been taken from 
Efstathiou et al. (1988). The upper panel shows the multiplicity function for a power- 
law spectral index of n = -|-1 for the variance of the Fourier mode amplitudes of the 
density distribution, whereas the lower panel shows the multiplicity function for a 
spectral index of n = — 1. 

Figure 2a The multiplicity function {1 / p){dp{M) / dlog2M) presented double-logarithmically as a 
function of rescaled mass Sc^^^^^^ (M/M^). The solid line shows the renormalized mul- 
tiplicity function, whereas the dashed line shows the original PS- multiplicity function. 
The dotted line represents the multiplicity function computed by using the analytic 
approximation equation (11). The calculation assumes a spectral index of n = -|-1. 

Figure 2b Same as figure (2a), but here a spectral index of n = is assumed. 

Figure 2c Same as figure (2a), but here a spectral index of n = — 1 is assumed. 
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